ABSTRACT

This paper describes the 2-year evolution of an
ongoing program of inservice training for public school personnel
designed to train observers in the use of a systematic observation
technique, the Reciprocal Category System. The 605 participants
include teachers, librarians, administrators, and central office
personnel. Nine aspects of the four training sessions are discussed
with respect to the three major headings of "organization," "training
procedure," and "observer competence." Under the category of
organization, training time, group size, and training sequence are
considered. A training time of two days is developed, while a
comparison of results by group size indicates that large groups are
as effective as small groups. Training procedural concerns include
development of a manual, revised twice; content, which shifts in
emphasis from concepts and theory to skills and calculations; and
trainer use. Observer competence elements include types of
instruments (a concept test and a skill test) and the improvement
made in these two instruments. The paper concludes that it has been
found possible to train large groups of observers in two days to a
responsible level of accuracy in the use of the Reciprocal Category
System for classroom observation for purposes of classroom
self-analysis. (RT)

Introduction

Systematic observation is a data collection procedure with 
recognized usefulness in a variety of settings, ranging from controlled 
experimental research to self-evaluation efforts of classroom teachers. 
Reports of research using interaction analysis and other observational
techniques have included little discussion of the major problem in
using observation: the training of observers. It is generally assumed
that training observers to an acceptable level of accuracy takes a
considerable amount of time, which precludes the use of systematic
observation in areas where it would be most appropriate. Besides their
obvious function in empirical research, observational systems can be
useful tools in the supervision and self-evaluation of teachers.

It is the purpose of this paper to describe the two-year evolution
of an ongoing program of inservice training for public school personnel
designed to train observers in the use of a systematic observation
technique, with a focus on the use of observation as a methodology
for teacher self-evaluation. Three primary issues are considered:

1. The development of training procedures to produce maximum
proficiency of observers with the most economical use of
time and training personnel.

2. The development and testing of training materials.

3. The development of methods to assess observer competency.

General Description of the Training Sessions

The Reciprocal Category System (RCS) developed by Richard Ober
was chosen as the observational system around which the training
program would be designed. Ober's RCS is one of several systems that
focus on an analysis and description of classroom verbal interaction.

Four training sessions have been conducted in the Metro Atlanta
area during the past two years. A total of 605 trainees have
participated in these sessions. The sessions were conducted during the
summer of 1968 (S-68), the winter of 1969 (W-69), the summer of 1969
(S-69), and the winter of 1970 (W-70). Participants have represented
the range of public school assignments, including teachers, librarians,
administrators, and central office personnel. They were selected for
attendance by their respective school systems on both a voluntary and
an assigned basis. Participants were of both sexes and represented
various ethnic groups.

Various aspects of the training sessions have been developed and
modified over the period that the training was conducted. Decisions to
make modifications were based on both subjective judgments and empirical
data. In presenting the evolution of this program, the plan is to
present the salient features of each of the four training sessions in
terms of nine major aspects of training, focusing primarily on those
modifications that seem to have the widest implications and greatest
generalizability. Where available, empirical data are presented to
support certain training modifications; at other times, the subjective
judgments that led to changes are reported. The nine aspects of
training are discussed with respect to the three major headings of
Organization, Training Procedures, and Observer Competence. Figure 1
summarizes the salient features of the four training sessions with
respect to the nine aspects of training.

Organization 

Three aspects of organization were considered: training time,
group size, and training sequence. Training time refers to the number
of hours required to train participants. After the first session,
S-68, which was conducted over three days, training sessions were
reduced to 12 hours, or two days. Table 1 shows the effect of this time
reduction on the accuracy of observer codings of a taped sample of
classroom verbal behavior. The reduction in the level of trainee
accuracy was not deemed sufficient to justify continuing the three-day
schedule. These results should be viewed cautiously, since factors
affecting accuracy scores included several other changes in procedures
in addition to the reduction in training time.

Table 1

Comparison of Three- and Two-Day Training Sessions
on Criterion Tape Accuracy*

                 S-68    W-69

N                  76     208
% Above .60       62%     50%
% Above .50       77%     71%

* Accuracy coefficients were computed using Scott's method for
observer reliability. Scott's procedure essentially provides
an index of observer agreement. The term accuracy coefficient is
used here since each trainee's observations were compared with
his trainer's coding of the criterion tape. In this sense the
coefficient is more nearly like a validity estimate than a
reliability estimate.
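
As an illustration of the accuracy coefficient described in the note to
Table 1, the following is a minimal sketch of Scott's coefficient in its
standard item-by-item form, comparing a trainee's codes with a trainer's
codes for the same tape. The paper does not state whether the
coefficient was computed tally by tally or from category totals, so the
item-level form is an assumption, as are the function name and the
sample codes.

```python
from collections import Counter


def scotts_pi(trainee_codes, trainer_codes):
    """Scott's pi for two equal-length sequences of category codes.

    pi = (Po - Pe) / (1 - Pe), where Po is the observed proportion of
    agreement and Pe is the agreement expected by chance, based on the
    pooled category proportions of both coders.
    """
    if len(trainee_codes) != len(trainer_codes) or not trainee_codes:
        raise ValueError("both codings must be non-empty and the same length")
    n = len(trainee_codes)
    observed = sum(a == b for a, b in zip(trainee_codes, trainer_codes)) / n
    pooled = Counter(trainee_codes) + Counter(trainer_codes)
    expected = sum((count / (2 * n)) ** 2 for count in pooled.values())
    if expected == 1.0:
        raise ValueError("coefficient is undefined when only one category is used")
    return (observed - expected) / (1 - expected)


# Hypothetical codes for illustration only; not data from the sessions.
trainee = [1, 1, 2, 2]
trainer = [1, 1, 2, 1]
accuracy = scotts_pi(trainee, trainer)
```

Under this reading, a trainee counted in the "above .60" row of Table 1
would be one whose coefficient against the trainer's coding exceeds .60.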

Group size as a variable has been varied in the following manner:
(1) a combination of large group and small group, with small groups
(approximate n of 25) for skill instruction; (2) small groups varying
in size from 15 to 40 participants; (3) large groups (n of 94) versus
small groups (approximate n's of 30) for self-contained instruction;
and (4) standard small groups of 20 to 25 participants. Table 2 reports
results of a large-group versus small-group comparison in which the
procedures were as nearly equivalent as possible. While differences on
both criteria favor the large group, it is more important to note that
they do not favor small group size. This implies an important economic
savings in training.

Table 2

Comparison of Trainee Performance in Large and Small
Group Training Sessions

                          Large Group    Small Group

Sample size                  N = 94         N = 31

Concept Test
   Mean                       22.3           20.0*
   S.D.                        5.2            4.6

Criterion Tape
   % Accuracy Scores
   above .60                  55.3%          51.6%

* Difference between the two means significant at the .05 level.
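
The significance note on the Concept Test means in Table 2 can be
checked approximately from the reported summary statistics alone. The
sketch below assumes a pooled-variance two-sample t test; the paper
does not say which test was used, so this is one plausible
reconstruction and not the authors' procedure.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic and degrees of freedom from summary statistics.

    Assumes equal population variances (pooled-variance form).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


# Concept Test values as reported in Table 2 (large group, then small group).
t_value, df = pooled_t(22.3, 5.2, 94, 20.0, 4.6, 31)
```

With these inputs the statistic comes to roughly 2.19 on 123 degrees of
freedom, which exceeds the two-tailed .05 critical value of about 1.98
and is therefore consistent with the note to Table 2.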

Training sequences have involved fixed, variable, and standard
approaches. Early attempts included a fixed schedule involving all
participants in the same activities (though at times as one large
group and at other times as small skill instruction groups). During
W-69 a variable sequence was employed, because trainers had not been
satisfied with procedures of organization for the S-68 session. The
variable sequence was achieved by teaming instructors, dividing
participants into two groups, and flip-flopping the activities, so
that the groups participated in different instructional sequences.
Subjective trainer judgment and dissatisfaction with this "team
teaching" approach led to the "standard sequence" utilized with the
S-69 and W-70 participants, in which each trainer conducts the entire
set of training activities with a single group.

Training Procedures

Three main procedural concerns were development of a training
manual, ordering and presenting content in training sessions, and
trainer differences. The training manual evolved through three
stages to its present form. Version 1 was a collection of articles
and interpretive comment. Version 2 was written specifically for
training purposes, based largely on Richard Ober's Working Manual
(incorporated in Version 1). Version 3 moved toward a semi-programmed
revision of the second manual. Refinement of manual content is
continuing. The manual has become an integral part of the training
materials and provides a degree of standardization of training
procedures.

Content originally was separated into concepts and theory, and
skills and calculations (S-68). Based on trainers' opinions that the
emphasis should be on applications of skills, W-69 sessions devoted
less time to theory and calculations and more time to coding skills.
During S-69 increased emphasis was placed on observer performance
skill; the only change in W-70 was an increase in practice time. The
effect of this additional practice will be examined by comparing S-69
with W-70 for gross indications of this modification as soon as W-70
data can be analyzed.

The effects of individual trainers were appraised in the initial
session (S-68). Results indicated that for the original trainers
there were no apparent differences. Since these trainers continued to
serve as the core instructors through the series, no additional
investigations have been attempted. Once procedures are standardized,
subsequent investigations can examine trainer differences. Division
of responsibility has been varied, with initial procedures (S-68)
requiring one trainer for the total group to handle theoretical
considerations and individual group leaders for skill instruction.

W-69 sessions utilized a special division of tasks such that cooperating
trainers handled either data collection activities for all participants
(in small groups) or preparation and interpretation activities. By
S-69 trainers directed the entire sequence for a specified group of
participants; this procedure was continued in W-70 sessions. The
evolution of a complete-sequence-per-trainer approach was based on
judgments of trainers in post mortem discussions of the training
sessions.

Observer Competence 

Instruments for assessing observer proficiency have all along
been considered of major importance. Four types of instruments were
considered initially (S-68). These included: (1) a Concept Test
stressing understanding of the major aspects of the system; (2)
Criterion Tapes, which focused on the extent to which a trainee could
accurately code the verbal interaction depicted on the audio tapes;
(3) a Skill Test, which was concerned primarily with data preparation
and interpretation; and (4) a Final Exam, which was essentially a
combination of content covered by the Skill Test and the Concept Test.

Analyses of data for S-68 indicated that the two most promising
evaluative instruments were the Concept Test and the Criterion Tape
accuracy scores. Scores from these two devices showed low correlations
(r = .14), indicating that different aspects of competency were being
assessed. In addition, a substantial correlation (r = .74) between the
Skill Test and the Final Exam indicated that in the interest of time
one of these instruments could be eliminated. It was decided to
eliminate the Final Exam entirely and to incorporate the Skill Test
into the training procedures. The subsequent sessions used only the
Concept Test and Criterion Tape methods to assess observer competence.

The Concept Test underwent two revisions based on S-68 and W-69
data, and another revision is planned based on W-70 data that will
involve increasing test length in order to increase test reliability.
Table 3 summarizes the statistical data for the different versions of
the Concept Test.

Table 3

Statistical Data for the Different Versions of the Concept Test

                      S-68     W-69     S-69     W-70

Version No.             1        2        3        3
No. Items              25       36       38       38
Mean                16.84    21.52    21.14    23.65
S.D.                 3.10     5.06     4.99     4.97
Reliability (K-R)    .59?     .732     .712     .732
Sample Size            76      228      251       81
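
The planned lengthening of the Concept Test rests on the usual
relationship between test length and reliability. A minimal sketch of
that relationship, using the Spearman-Brown prophecy formula, is given
below. The paper does not name the formula the authors intend to use,
and the projection assumes that any added items are comparable to the
existing ones; the lengthening factor of 2 is purely hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a test is lengthened by length_factor.

    length_factor is new length divided by old length (2.0 means the
    test is doubled with items comparable to the existing ones).
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Reported W-70 reliability from Table 3, with a hypothetical doubling of length.
projected = spearman_brown(0.732, 2.0)
```

For the reported reliability of .732, doubling the test would project a
reliability of about .85 under these assumptions.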



The development of a Criterion Tape has gone through several
stages. The first tape to be used (Version 1) was not only used to
assess accuracy in coding observations, but was also used in the
training procedures. This practice was somewhat questionable and was
discarded after the W-69 session. Another reason for revising the
Criterion Tape and evaluation procedures was that in calculating
observer accuracy (for sessions S-68 and W-69), each participant was
compared with his own group trainer's coding.

To improve on some of these deficiencies a second tape was
developed (Version 2), to be used only for final competency evaluation.
Additionally, a standard "key" was constructed so that each trainee's
accuracy score would be based on the same comparison. This "key" was
developed from a composite of the ratings of five judges who had served
as trainers. The intra-judge reliabilities for these five judges are
shown in Table 4.

Table 4

Intra-Judge Reliability* of Ratings for Version 2
of the Criterion Tape

Judge No.       1       2       3

    2          .66
    3          .85     .73
    4          .69     .76     .77
    5          .80     .65     .86

* Reliability of judges' ratings computed by Scott's coefficient
procedure
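
The paper does not describe how the five judges' ratings were combined
into the standard "key." One simple possibility, sketched here strictly
as an assumption, is to take the most frequent code among the judges at
each tally position. The sketch also assumes that the judges' codings
are already aligned tally by tally, which real codings of a tape may not
be; the function name, tie-breaking rule, and sample codes are all
hypothetical.

```python
from collections import Counter


def composite_key(judge_codings):
    """Build a key by majority vote across judges at each tally position.

    judge_codings is a list of equal-length code sequences, one per judge.
    Ties are broken by taking the lowest category number, an arbitrary
    choice made only so the result is deterministic.
    """
    if len({len(codes) for codes in judge_codings}) != 1:
        raise ValueError("all judges must supply the same number of tallies")
    key = []
    for position_codes in zip(*judge_codings):
        counts = Counter(position_codes)
        top = max(counts.values())
        key.append(min(code for code, n in counts.items() if n == top))
    return key


# Hypothetical codings by five judges, for illustration only.
judges = [
    [1, 2, 3],
    [1, 2, 4],
    [1, 5, 4],
    [2, 5, 4],
    [1, 2, 4],
]
key = composite_key(judges)
```

Each trainee's coding of the tape could then be compared against such a
key, so that every accuracy score rests on the same comparison.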

There are several categories of verbal behavior measured by the
RCS that were not represented in Version 2 of the criterion tape. It
was felt that another tape should be developed that better represented
all of the categories in the RCS. This tape has been developed
(Version 3, W-70), but the data have not yet been processed.

Current Status

Two years of schedule refinement have reduced the original
procedures from 18 hours of instruction and testing to 12 hours of
instruction and criterion performance tasks. Savings in time and
paperwork resulted largely from the finding that the concept criterion
score and accuracy score predicted substantially what was identified
through paper and pencil skill tests and final competence checks. A
current training schedule is presented as Figure 2. The schedule
reflects the critical importance of accuracy in coding interactions.
Accuracy exercises on the second day are designed to compare
participant-coded data with the criterion score on the same audio tape.
Only two segments, Data Preparation and Data Interpretation, focus on
other aspects of observer competence. This does not imply a lack of
importance of data use. Rather, it focuses the participant on "first
things first," the task of accurately observing and recording verbal
behavior.

Figure 2

Schedule of the Most Recent Training Session (W-70)

Thursday, January 22, 1970            Friday, January 23, 1970

 8:45  Introduction to                 8:45  Collection of live data
       Beginning Data Collection       9:00  Micro-lesson (Simulation)
10:30  Coffee                         10:15  Coffee
10:40  Response Discrimination        10:25  Data Interpretation
       (15's & 16's)
11:30  Intermediate Data Collection   12:00  Accuracy Exercise
12:30  Lunch                          12:30  Lunch
 1:15  Data Preparation                1:15  Advanced Data Collection
 2:30  Intermediate Data Collection    2:30  Accuracy Exercise
 3:00  To be continued                 3:00  Synthesis

Data collection is approached from three complementary views:
practice coding of prepared audio tapes, practice coding of live
simulated situations, and practice in eliciting selected verbal
behaviors through individually planned minute lessons (based on a
priori scripts of sequences of categories) as well as coding these
simulated lessons. Comparisons in all cases are against "expert"
criterion scores.

Data preparation skill sessions include techniques of summarization
(bracketing, plotting, and totaling) and calculations (percentages
and ratios). Data interpretation sessions emphasize identification
of important aspects of data sequences and clumpings based on
instructional plans of the teacher (patterns of verbal behavior and
concentrations).
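
The data preparation steps named above can be sketched in a few lines.
The sketch assumes the usual interaction-analysis convention in which
"bracketing" pairs each code with the one that follows it, "plotting"
enters each pair in a row-by-column matrix, and "totaling" counts
tallies per category; the paper itself does not define these terms. The
function names and the sample sequence are hypothetical.

```python
from collections import Counter


def bracket(codes):
    """Pair each code with the code that follows it."""
    return list(zip(codes, codes[1:]))


def plot_matrix(codes):
    """Tally each (row, column) pair; a Counter stands in for the paper matrix."""
    return Counter(bracket(codes))


def category_percentages(codes):
    """Percentage of all tallies falling in each category."""
    totals = Counter(codes)
    return {cat: 100 * n / len(codes) for cat, n in sorted(totals.items())}


def category_ratio(codes, numerator, denominator):
    """Ratio of tallies in one set of categories to tallies in another."""
    totals = Counter(codes)
    top = sum(totals[c] for c in numerator)
    bottom = sum(totals[c] for c in denominator)
    if bottom == 0:
        raise ValueError("no tallies in the denominator categories")
    return top / bottom


# Hypothetical sequence of category numbers, for illustration only.
sample = [10, 4, 14, 4, 14, 2, 10]
matrix = plot_matrix(sample)
percentages = category_percentages(sample)
```

Cells of the matrix with heavy tallies correspond to the concentrations,
and chains of linked cells to the patterns, that the interpretation
sessions ask participants to identify.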

After two years of planned successive approximations, it is
possible to report that observers can be trained in large groups to a
responsible level of accuracy and educated to a reasonable theoretical
and conceptual understanding within 12 hours of training, for purposes
of classroom self-analysis through the Reciprocal Category System of
Interaction Analysis.



